About the Provider

Moonshot AI is a Chinese AI research company focused on building large-scale foundation models with advanced agentic and multimodal capabilities. Kimi K2.5 is their most powerful open-source release, built through continual pretraining on 15 trillion mixed visual and text tokens, combining frontier reasoning, vision understanding, and multi-agent orchestration in a single model.

Model Quickstart

This section helps you quickly get started with the moonshotai/Kimi-K2.5 model on the Qubrid AI inference platform. To use this model, you need:
  • A valid Qubrid API key
  • Access to the Qubrid inference API
  • Basic knowledge of making API requests in your preferred language
Once authenticated with your API key, you can send inference requests to the moonshotai/Kimi-K2.5 model and receive responses based on your input prompts. The examples below show how to access the model from different programming environments; choose the one that best fits your workflow.
from openai import OpenAI

# Initialize the OpenAI client with the Qubrid base URL
client = OpenAI(
    base_url="https://platform.qubrid.com/v1",
    api_key="QUBRID_API_KEY",  # replace with your Qubrid API key
)

# Create a streaming chat completion
stream = client.chat.completions.create(
    model="moonshotai/Kimi-K2.5",
    messages=[
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image? Describe the main elements."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
            }
          }
        ]
      }
    ],
    max_tokens=16384,
    temperature=1,
    top_p=0.95,
    stream=True
)

# With stream=True, iterate over chunks and print tokens as they arrive
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()

# With stream=False, the call returns a single completion instead;
# remove the loop above and read the full message:
# print(stream.choices[0].message.content)
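If you prefer not to use the OpenAI SDK, you can call the endpoint directly over HTTP. The sketch below assumes the platform exposes an OpenAI-compatible `/v1/chat/completions` route under the same base URL used above (an assumption inferred from the SDK configuration, not confirmed by the source); `build_payload` is an illustrative helper name.

```python
import json
import os
import urllib.request

# Assumed endpoint: OpenAI-compatible chat completions route under the
# Qubrid base URL shown in the SDK example above.
API_URL = "https://platform.qubrid.com/v1/chat/completions"

def build_payload(prompt: str, image_url: str) -> dict:
    """Build a non-streaming multimodal chat request body."""
    return {
        "model": "moonshotai/Kimi-K2.5",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 16384,
        "temperature": 1,
        "top_p": 0.95,
        "stream": False,
    }

if __name__ == "__main__":
    payload = build_payload(
        "What is in this image? Describe the main elements.",
        "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
    )
    # Only send the request when an API key is actually configured
    if "QUBRID_API_KEY" in os.environ:
        req = urllib.request.Request(
            API_URL,
            data=json.dumps(payload).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {os.environ['QUBRID_API_KEY']}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req, timeout=60) as resp:
            body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
```

This mirrors the SDK request body field for field, so parameters tuned in one form carry over to the other.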

Model Overview

Kimi K2.5 is Moonshot AI’s most powerful open-source model to date — a native multimodal agentic model built through continual pretraining on 15 trillion mixed visual and text tokens atop Kimi-K2-Base.
  • With 1T total parameters and 32B active per token, it seamlessly integrates vision, language, and advanced agentic capabilities including an Agent Swarm paradigm that coordinates up to 100 parallel sub-agents, reducing execution time by 4.5x on parallelizable tasks.
  • It achieves 76.8% on SWE-bench Verified and 50.2% on HLE (Humanity’s Last Exam) at 76% lower cost than Claude Opus 4.5, with a 256K context window and support for both Thinking and Instant modes.

Model at a Glance

Feature        | Details
---------------|--------
Model ID       | moonshotai/Kimi-K2.5
Provider       | Moonshot AI
Architecture   | Sparse MoE Transformer — 1T total / 32B active per token, continual pretraining on 15T vision + text tokens
Model Size     | 1T total / 32B active
Context Length | 256K tokens
Release Date   | 2025
License        | Apache 2.0
Training Data  | 15 trillion mixed visual and text tokens; RL post-training for agentic and reasoning tasks

When to Use?

You should consider using Kimi K2.5 if:
  • You need native multimodal agent workflows combining vision and language
  • Your application requires visual code generation from UI screenshots or video
  • You are building complex parallel tasks using Agent Swarm coordination
  • Your use case involves advanced web development with vision understanding
  • You need multimodal research and analysis at frontier scale
  • Your workflow requires image or video-to-code translation
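For screenshot-to-code and similar local-image workflows, the image usually is not hosted at a public URL. OpenAI-compatible message formats commonly accept base64 data URLs in the `image_url` part; whether Qubrid's deployment supports this is an assumption, and `image_to_data_url` / `image_message` are illustrative helper names.

```python
import base64
from pathlib import Path

def image_to_data_url(path: str, mime: str = "image/png") -> str:
    """Encode a local image file as a base64 data URL, usable in the
    image_url content part in place of a hosted HTTP URL."""
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"

def image_message(prompt: str, path: str) -> dict:
    """Build a user message pairing a text prompt with a local image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_to_data_url(path)}},
        ],
    }
```

The resulting dict drops straight into the `messages` list of the quickstart example, e.g. `messages=[image_message("Convert this mockup to HTML", "mockup.png")]`.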

Inference Parameters

Parameter Name | Type    | Default  | Description
---------------|---------|----------|------------
Streaming      | boolean | true     | Enable streaming responses for real-time output.
Temperature    | number  | 1        | Recommended 1.0 for Thinking mode, 0.6 for Instant mode.
Max Tokens     | number  | 16384    | Maximum number of tokens to generate.
Top P          | number  | 0.95     | Controls nucleus sampling.
Mode           | select  | thinking | Thinking mode enables deep reasoning traces; Instant mode provides fast direct responses.
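Since the recommended temperature differs by mode, it can help to keep the per-mode defaults from the table in one place. This is a minimal sketch with hypothetical names (`MODE_PRESETS`, `sampling_kwargs`); it only covers the sampling parameters shown above, not how the mode itself is sent to the API.

```python
# Per-mode sampling defaults from the Inference Parameters table:
# temperature 1.0 for Thinking mode, 0.6 for Instant mode.
MODE_PRESETS = {
    "thinking": {"temperature": 1.0, "top_p": 0.95, "max_tokens": 16384},
    "instant": {"temperature": 0.6, "top_p": 0.95, "max_tokens": 16384},
}

def sampling_kwargs(mode: str = "thinking", **overrides) -> dict:
    """Return the recommended sampling parameters for a mode,
    allowing per-call overrides (e.g. a smaller max_tokens)."""
    if mode not in MODE_PRESETS:
        raise ValueError(f"unknown mode: {mode!r}")
    return {**MODE_PRESETS[mode], **overrides}
```

The result can be splatted into a request, e.g. `client.chat.completions.create(model="moonshotai/Kimi-K2.5", messages=..., **sampling_kwargs("instant"))`.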

Key Features

  • 76.8% SWE-bench Verified: Frontier-level software engineering performance at open-source scale.
  • 50.2% HLE (Humanity’s Last Exam): Achieves this at 76% lower cost than Claude Opus 4.5.
  • Agent Swarm: Coordinates up to 100 parallel sub-agents, reducing execution time by 4.5x on parallelizable tasks.
  • Native Multimodal: Jointly trained on 15T vision and text tokens — not a bolted-on vision encoder.
  • Thinking and Instant Modes: Configurable reasoning depth — deep chain-of-thought or fast direct responses.
  • 256K Context Window: Long-horizon document analysis and multi-turn agentic workflows.
  • Apache 2.0 License: Fully open source with full commercial freedom.

Summary

Kimi K2.5 is Moonshot AI’s flagship open-source multimodal agentic model, built for complex reasoning and parallel agent execution.
  • It uses a Sparse MoE architecture with 1T total and 32B active parameters, pretrained on 15 trillion mixed vision and text tokens.
  • It leads on SWE-bench Verified (76.8%) and HLE (50.2%) while delivering 76% cost savings over Claude Opus 4.5.
  • The model supports Agent Swarm with up to 100 parallel sub-agents, Thinking and Instant modes, and a 256K context window.
  • Licensed under Apache 2.0 for full commercial use.